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1. Abstract 


MIME [RFC-2045, RFC-2046, RFC-2047, RFC-2184] and various other 
modern Internet protocols are capable of using many different 
charsets. This in turn means that the ability to label different 
charsets is essential. This registration procedure exists solely to 
associate a specific name or names with a given charset and to give 
an indication of whether or not a given charset can be used in MIME 
text objects. In particular, the general applicability and 
appropriateness of a given registered charset is a protocol issue, 
not a registration issue, and is not dealt with by this registration 
procedure. 


2. Definitions and Notation 
The following sections define various terms used in this document. 
2.1. Requirements Notation 
This document occasionally uses terms that appear in capital letters. 
When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY" 
appear capitalized, they are being used to indicate particular 


requirements of this specification. A discussion of the meanings of 
these terms appears in [RFC-2119]. 
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2.2. Character 


A member of a set of elements used for the organisation, control, or 
representation of data. 


2.3. Charset 


The term "charset" (see historical note below) is used here to refer 
to a method of converting a sequence of octets into a sequence of 
characters. This conversion may also optionally produce additional 
control information such as directionality indicators. 


Note that unconditional and unambiguous conversion in the other 
direction is not required, in that not all characters may be 
representable by a given charset and a charset may provide more than 
one sequence of octets to represent a particular sequence of 
characters. 


This definition is intended to allow charsets to be defined ina 
variety of different ways, from simple single-table mappings such as 
US-ASCII to complex table switching methods such as those that use 


ISO 2022’s techniques, to be used as charsets. However, the 
definition associated with a charset name must fully specify the 
mapping to be performed. In particular, use of external profiling 


information to determine the exact mapping is not permitted. 


HISTORICAL NOTE: The term "character set" was originally used in MIME 
to describe such straightforward schemes as US-ASCII and ISO-8859-1 
which consist of a small set of characters and a simple one-to-one 
mapping from single octets to single characters. Multi-octet 
character encoding schemes and switching techniques make the 
situation much more complex. As such, the definition of this term was 
revised to emphasize both the conversion aspect of the process, and 
the term itself has been changed to "charset" to emphasize that it is 
not, after all, just a set of characters. A discussion of these 
issues as well as specification of standard terminology for use in 
the IETF appears in RFC 2130. 


2.4. Coded Character Set 
A Coded Character Set (CCS) is a mapping from a set of abstract 
characters to a set of integers. Examples of coded character sets are 


ISO 10646 [ISO-10646], US-ASCII [US-ASCII], and the ISO-8859 series 
[ISO-8859]. 
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2.5. Character Encoding Scheme 


A Character Encoding Scheme (CES) is a mapping from a Coded Character 
Set or several coded character sets to a set of octets. A given CES 
is typically associated with a single CCS; for example, UTF-8 applies 
only to ISO 10646. 


3. Registration Requirements 


Registered charsets are expected to conform to a number of 
requirements as described below. 


3.1. Required Characteristics 


Registered charsets MUST conform to the definition of a "charset" 
given above. In addition, charsets intended for use in MIME content 
types under the "text" top-level type must conform to the 
restrictions on that type described in RFC 2045. All registered 
charsets MUST note whether or not they are suitable for use in MIME. 


All charsets which are constructed as a composition of a CCS anda 
CES MUST either include the CCS and CES they are based on in their 
registration or else cite a definition of their CCS and CES that 
appears elsewhere. 


All registered charsets MUST be specified in a stable, openly 
available specification. Registration of charsets whose 
specifications aren’t stable and openly available is forbidden. 


3.2. New Charsets 


This registration mechanism is not intended to be a vehicle for the 
definition of entirely new charsets. This is due to the fact that the 
registration process does NOT contain adequate review mechanisims for 
such undertakings. 


As such, only charsets defined by other processes and standards 
bodies, or specific profiles of such charsets, are eligible for 
registration. 


3.3. Naming Requirements 


One or more names MUST be assigned to all registered charsets. 
Multiple names for the same charset are permitted, but if multiple 
names are assigned a single primary name for the charset MUST be 
identified. All other names are considered to be aliases for the 
primary name and use of the primary name is preferred over use of any 
of the aliases. 
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Each assigned name MUST uniquely identify a single charset. All 
charset names MUST be suitable for use as the value of a MIME content 
type charset parameter and hence MUST conform to MIME parameter value 
syntax. This applies even if the specific charset being registered is 
not suitable for use with the "text" media type. 


Finally, charsets being registered for use with the "text" media type 
MUST have a primary name that conforms to the more restrictive syntax 
of the charset field in MIME encoded-words [RFC-2047, RFC-2184] and 
MIME extended parameter values [RFC-2184]. A combined ABNF definition 
for such names is as follows: 


mime-charset = 1*<Any CHAR except SPACE, CTLs, and cspecials> 
cspecials — LNE / nyon / "<" / wou / wan / wow Die / "n / " 
<"> / wyn / wars / mj / mon / n wow / wee 

CHAR <any ASCII character> $ 0-177, 0.-127.) 

SPACE = ASCII SP, space? ; 40, 32.) 

CTE = <any ASCII control $ 0- 37, 0.- 31.) 
character and DEL> 5 LUT, 127.) 

3.4. Functionality Requirement 


Charsets must function as actual charsets: Registration of things 
that are better thought of as a transfer encoding, as a media type, 
or as a collection of separate entities of another type, is not 
allowed. For example, although HTML could theoretically be thought 
of as a charset, it is really better thought of as a media type and 
as such it cannot be registered as a charset. 


-5. Usage and Implementation Requirements 


Use of a large number of charsets in a given protocol may hamper 
interoperability. However, the use of a large number of undocumented 
and/or unlabelled charsets hampers interoperability even more. 


A charset should therefore be registered ONLY if it adds significant 
functionality that is valuable to a large community, OR if it 
documents existing practice in a large community. Note that charsets 
registered for the second reason should be explicitly marked as being 
of limited or specialized use and should only be used in Internet 
messages with prior bilateral agreement. 


-6. Publication Requirements 


Charset registrations can be published in RFCs, however, 
publication is not required to register a new charset. 


RFC 
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The registration of a charset does not imply endorsement, approval, 
or recommendation by the IANA, IESG, or IETF, or even certification 
that the specification is adequate. It is expected that applicability 
statements for particular applications will be published from time to 
time that recommend implementation of, and support for, charsets that 
have proven particularly useful in those contexts. 


3.7. MIBenum Requirements 


Each registered charset MUST also be assigned a unique enumerated 
integer value. These "MIBenum" values are defined by and used in the 
Printer MIB [RFC-1759]. 


A MIBenum value for each charset will be assigned by IANA at the time 
of registration. 


4. Registration Procedure 


The following procedure has been implemented by the IANA for review 
and approval of new charsets. This is not a formal standards 
process, but rather an administrative procedure intended to allow 
community comment and sanity checking without excessive time delay. 


4.1. Present the Charset to the Community 


Send the proposed charset registration to the "ietf- 
charsets@iana.org" mailing list. This mailing list has been 
established for the sole purpose of reviewing proposed charset 
registrations. Proposed charsets are not formally registered and must 
not be used; the "x-" prefix specified in RFC 2045 can be used until 
registration is complete. 


The intent of the public posting is to solicit comments and feedback 
on the definition of the charset and the name chosen for it over a 
two week period. 


4.2. Charset Reviewer 


When the two week period has passed and the registration proposer is 
convinced that consensus has been achieved, the registration 
application should be submitted to IANA and the charset reviewer. The 
charset reviewer, who is appointed by the IETF Applications Area 
Director(s), either approves the request for registration or rejects 
it. Rejection may occur because of significant objections raised on 
the list or objections raised externally. If the charset reviewer 
considers the registration sufficiently important and controversial, 
a last call for comments may be issued to the full IETF. The charset 
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reviewer may also recommend standards track processing (before or 
after registration) when that appears appropriate and the level of 
specification of the charset is adequate. 


Decisions made by the reviewer must be posted to the ietf-charsets 
mailing list within 14 days. Decisions made by the reviewer may be 
appealed to the IESG. 


4.3. IANA Registration 


Provided that the charset registration has either passed review or 
has been successfully appealed to the IESG, the IANA will register 
the charset, assign a MIBenum value, and make its registration 
available to the community. 


5. Location of Registered Charset List 


Charset registrations will be posted in the anonymous FTP file 
"ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets" and all 
registered charsets will be listed in the periodically issued 
"Assigned Numbers" RFC [currently RFC-1700]. The description of the 
charset may also be published as an Informational RFC by sending it 
to "rfc-editor@isi.edu" (please follow the instructions to RFC 
authors [RFC-2223]). 


6. Registration Template 


To: ietf-charsets@iana.org 
Subject: Registration of new charset 


Charset name(s): 


(All names must be suitable for use as the value of a MIME content- 
type parameter.) 


Published specification(s): 

(A specification for the charset must be openly available that 
accurately describes what is being registered. If a charset is 
defined as a composition of a CCS and a CES then these defintions 


must either be included or referenced.) 


Person & email address to contact for further information: 
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not known to raise any sort of 


security considerations that are appreciably different from those 
already existing in the protocols that employ registered charsets. 
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Full Copyright Statement 
Copyright (C) The Internet Society (1998). All Rights Reserved. 


This document and translations of it may be copied and furnished to 
others, and derivative works that comment on or otherwise explain it 
or assist in its implementation may be prepared, copied, published 
and distributed, in whole or in part, without restriction of any 
kind, provided that the above copyright notice and this paragraph are 
included on all such copies and derivative works. However, this 
document itself may not be modified in any way, such as by removing 
the copyright notice or references to the Internet Society or other 
Internet organizations, except as needed for the purpose of 
developing Internet standards in which case the procedures for 
copyrights defined in the Internet Standards process must be 
followed, or as required to translate it into languages other than 
English. 


The limited permissions granted above are perpetual and will not be 
revoked by the Internet Society or its successors or assigns. 


This document and the information contained herein is provided on an 
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
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